The complete mitochondrial genome of freshwater gammarid Gammarus nipponensis (Crustacea: Amphipoda: Gammaridae)

Abstract This study presents the complete mitochondrial genome sequence of Gammarus nipponensis, a freshwater crustacean found in western Honshu, Shikoku and Kyushu in Japan. The genome is 16,429 bp in length and encodes the standard set of 13 protein-coding genes, two ribosomal RNA genes and 22 transfer RNA genes, as well as two putative control regions. The mitochondrial genome of G. nipponensis is rich in A and T nucleotides (67.1%). Notably, control region 2 contains a long tract of TATTTTA tandem repeats spanning 686 bp. This newly available genome information will be useful for studying the evolutionary relationships within the genus Gammarus and for understanding diversification among G. nipponensis populations.


Introduction
The genus Gammarus Fabricius 1775 is considered one of the most diverse genera of crustaceans (Väinölä et al. 2008). In Japan, three Gammarus species have been recorded: Gammarus nipponensis Uéno 1940, Gammarus koreanus Uéno 1940 and Gammarus mukudai Tomikawa et al. 2014. Among these, G. nipponensis has the widest distribution, inhabiting western Honshu, Shikoku and Kyushu (Tomikawa 2017). These Gammarus species are found exclusively in freshwater habitats, which limits their dispersal. Moreover, the life cycle of gammarids involves direct development, in which the fertilized eggs are brooded until the juvenile stage (Hyne 2011; Bastos-Pereira and Bueno 2016). Given these characteristics, G. nipponensis can be used to examine the effect of habitat fragmentation on the degree of genetic variation within and among its populations.
A previous study on the phylogenetics of G. nipponensis relied on the cox1 and 28S rRNA genes (Tomikawa et al. 2014), and the classification and phylogenetic status of the genus Gammarus remain problematic (Hou et al. 2007; 2014; Hou and Sket 2016; Macher et al. 2017). The full mitochondrial genome of G. nipponensis therefore provides a useful resource for further phylogenetic and phylogeographic study. Furthermore, this is the first complete mitochondrial genome reported for the genus Gammarus in Japan.

Materials and methods
The specimen of G. nipponensis (Figure 1) was obtained from the Kuro River in Asakura, Fukuoka (33°23'56.8"N, 130°48'19.6"E) and immediately preserved in 95% ethanol. The species identification was based on the morphological characters described in Tomikawa et al. (2014). According to the reference, the accessory flagellum is present on antenna 1, and the posterior margins of peduncular articles 4 and 5 in antenna 2 have 6 and 7 clusters of long setae, respectively.
The extracted DNA was submitted to Bioengineering Lab. Co., Ltd. (Sagamihara, Japan) for library construction and sequencing. Library preparation followed the MGIEasy FS DNA Library Prep Set manual, and high-throughput DNA sequencing was performed using the DNBSEQ-G400 system (MGI Tech, Shenzhen, China). Raw reads (25,350,425 pairs of 200 bp) were filtered using Sickle ver. 1.33 (Joshi and Fass 2011) to remove adapter sequences and reads with a quality score below 30. The resulting high-quality reads were assembled using the meta mode of SPAdes ver. 3.14.0 (Nurk et al. 2017), then annotated using the MITOS web server (Bernt et al. 2013) and subsequently corrected manually to improve accuracy. The short-read assembly revealed tandem repeats of TATTTTA in the mitochondrial genome of this species. PCR across the repeats using the primers Gnip CR2F (5'-GTGATTACAATAACATTTGGCGGC-3') and Gnip CR2R (5'-GCTATAAGGCTGACCTCAAGGC-3') showed that the repeat tract was approximately 650 bp long. Since its precise length could not be determined by Sanger sequencing, the number of repeats was confirmed using a MinION sequencer (Oxford Nanopore Technologies) equipped with a FLO-MIN106D flow cell and the SQK-RAD004 kit. The long reads were assembled with Flye ver. 2.9.1 (Kolmogorov et al. 2019), and the assembled contig was then polished with Pilon ver. 1.22 using the short reads (Walker et al. 2014).
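As an illustration of how a tandem repeat tract such as the TATTTTA run could be located in an assembled contig, the following Python sketch reports the start position and copy number of the longest uninterrupted run of a motif. It is a minimal example under stated assumptions, not the procedure used in this study: the function name `longest_tandem_run` and the toy sequence are hypothetical, and only perfect, back-to-back copies are counted.

```python
import re


def longest_tandem_run(seq: str, motif: str) -> tuple[int, int]:
    """Return (0-based start, copy number) of the longest perfect run of `motif`.

    Returns (-1, 0) if the motif does not occur. Hypothetical helper for
    illustration only; imperfect repeats are not handled.
    """
    best_start, best_copies = -1, 0
    # One or more back-to-back copies of the motif, matched greedily.
    pattern = f"(?:{re.escape(motif.upper())})+"
    for match in re.finditer(pattern, seq.upper()):
        copies = (match.end() - match.start()) // len(motif)
        if copies > best_copies:
            best_start, best_copies = match.start(), copies
    return best_start, best_copies


# Toy sequence (not real data): five copies, a spacer, then two copies.
toy = "ACGT" + "TATTTTA" * 5 + "GGC" + "TATTTTA" * 2
start, copies = longest_tandem_run(toy, "TATTTTA")
print(start, copies, copies * len("TATTTTA"))  # start, copies, tract length in bp
```

Because short reads rarely span a tract of several hundred base pairs, a count obtained this way from a short-read assembly would still need confirmation from reads that cover the whole repeat, as was done here with long reads.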
The DNBSEQ short reads gave a read coverage depth of about 400x across the assembled mitogenome, with the lowest depth (1x) in the repetitive region of CR2 (Figure S1). The Oxford Nanopore long reads improved the coverage depth in CR2 (~1000x), and the alignment in CR2 (Figure S2) supports the reliability of the final assembly. Read coverage depth and the circular map of the mitochondrial genome were visualized using the Integrative Genomics Viewer (Thorvaldsdóttir et al. 2013) and the Proksee server at https://proksee.ca/projects (Grant et al. 2023), respectively. The annotated sequence was submitted to NCBI GenBank under accession number OR268765.2.
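Coverage statistics of this kind can also be summarized numerically. The Python sketch below assumes a per-position depth table with three tab-separated columns (reference name, 1-based position, depth), the layout produced by `samtools depth` for a single input file. This is an assumption for illustration, since the depth in this study was inspected visually; the function name `summarize_depth` and the toy values are hypothetical.

```python
def summarize_depth(lines):
    """Return (mean depth, minimum depth, position of minimum) from depth rows.

    Each row is 'reference<TAB>position<TAB>depth'. Blank rows are skipped.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _reference, position, depth = line.split("\t")
        records.append((int(position), int(depth)))
    if not records:
        raise ValueError("no depth records supplied")
    mean_depth = sum(depth for _, depth in records) / len(records)
    # Position with the lowest depth (first one if there are ties).
    min_position, min_depth = min(records, key=lambda rec: rec[1])
    return mean_depth, min_depth, min_position


# Toy rows (not real data): a sharp drop in depth at position 3.
toy_rows = ["mt\t1\t400", "mt\t2\t410", "mt\t3\t1", "mt\t4\t389"]
mean_depth, min_depth, min_position = summarize_depth(toy_rows)
print(mean_depth, min_depth, min_position)
```

A minimum far below the mean, localized to one region, is the pattern described above for the repetitive part of CR2 in the short-read data.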
Phylogenetic analysis was performed using the nucleotide sequences of the 13 protein-coding genes (PCGs) from all species within Gammaroidea (Taxonomy ID: 44329) available in the NCBI database. Two Lysianassidira species, Alicella gigantea Chevreux 1899 and Eurythenes maldoror D'Udekem d'Acoz and Havermans 2015, were selected as outgroups because of their sister relationship with Gammaroidea (Macher et al. 2023). The nucleotide sequences of each gene were aligned using the "auto" strategy in MAFFT ver. 7 (Katoh et al. 2019), and the ends were trimmed manually. Phylogenetic trees were then generated by the maximum likelihood method from the concatenated dataset, with 10,000 ultrafast bootstrap replicates in IQ-TREE ver. 1.6.12 (Nguyen et al. 2015). The best-fit substitution model was determined using ModelFinder in IQ-TREE under the Bayesian Information Criterion (Kalyaanamoorthy et al. 2017).
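The concatenation step can be sketched in Python as follows. This is a minimal illustration rather than the script used in this study: the function name `concatenate_alignments`, the toy alignments, and the choice to pad a taxon missing from a gene with gap characters are all assumptions. The function joins per-gene alignments in the order given and records the 1-based column range of each gene, which is the information a partition file would need.

```python
def concatenate_alignments(gene_alignments):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}.
    Returns (supermatrix, partitions), where supermatrix maps taxon ->
    concatenated sequence and partitions lists (gene, first_col, last_col)
    with 1-based inclusive columns.
    """
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    offset = 0
    for gene, alignment in gene_alignments.items():
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} are not the same length")
        width = lengths.pop()
        for taxon in taxa:
            # Assumption: a taxon lacking this gene is padded with gaps.
            pieces[taxon].append(alignment.get(taxon, "-" * width))
        partitions.append((gene, offset + 1, offset + width))
        offset += width
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy alignments (not real data); taxon B lacks the second gene.
toy_genes = {
    "cox1": {"A": "ATGA", "B": "ATGC"},
    "atp8": {"A": "TTA"},
}
matrix, parts = concatenate_alignments(toy_genes)
print(matrix, parts)
```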

Results
The mitochondrial genome of G. nipponensis is 16,429 bp in length and contains 13 protein-coding genes (PCGs, including nad4L, CytB, atp6 and atp8), two ribosomal RNAs (16S and 12S), 22 transfer RNAs (two each for Leu and Ser and one for each of the other amino acids) and two putative control regions (CRs) (Figure 2). The CRs are found in two positions: CR1, located after the 12S gene, and CR2, situated between tRNA-Tyr and tRNA-Ile, with a combined length of about 2200 bp. Interestingly, CR2 contains 98 TATTTTA repeats. Nine of the 13 PCGs and 14 of the 22 tRNAs are encoded on the forward (+) strand, while the remaining genes are encoded on the reverse (-) strand.
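The figures reported here can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the text (98 copies of the 7 bp TATTTTA motif, 13 PCGs of which nine are on the forward strand, 22 tRNAs of which 14 are on the forward strand, and two rRNAs); the variable names and the `at_content` helper are hypothetical and included only to show how an A+T percentage is computed.

```python
MOTIF = "TATTTTA"
REPEAT_COPIES = 98

# Length of the repeat tract implied by the copy number.
repeat_tract_bp = len(MOTIF) * REPEAT_COPIES

# Strand distribution of the protein-coding and tRNA genes.
pcg_total, pcg_forward = 13, 9
trna_total, trna_forward = 22, 14
rrna_total = 2
pcg_reverse = pcg_total - pcg_forward
trna_reverse = trna_total - trna_forward
gene_total = pcg_total + trna_total + rrna_total


def at_content(seq: str) -> float:
    """Percentage of A and T bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


print(repeat_tract_bp, pcg_reverse, trna_reverse, gene_total)
print(at_content(MOTIF))  # the repeat motif itself consists only of A and T
```

The implied tract length of 686 bp agrees with the value given in the abstract and is close to the approximately 650 bp estimated by PCR.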
Molecular phylogenetic analysis showed that the mitochondrial genome sequence assembled in this study is positioned within the superfamily Gammaroidea with close affinity to Gammarus pisinnus (Figure 3).

Discussion and conclusion
The mitochondrial genes of G. nipponensis are arranged similarly to the putative ancestral pancrustacean ground pattern; however, there are differences in the order of tRNAs (Ser, Glu, Arg and Gly) and in the occurrence of two CRs. Translocation of tRNAs is commonly observed in gammarids (Krebes and Bastrop 2012; Cormier et al. 2018; Mamos et al. 2021). Two CRs have also been reported in G. roeselii, but there the two regions have nearly identical sequences, unlike in our study species (Cormier et al. 2018). Another interesting feature of the G. nipponensis mitogenome is the presence of tandem repeats within the CR. Tandem repeats are also found in Gammarus duebeni, but they are far less numerous than in G. nipponensis, comprising six imperfect repeats of 84 to 97 bp each (Krebes and Bastrop 2012). Phylogenetic analysis suggested that G. nipponensis is closely related to G. pisinnus, and both species are native to East Asia (Sun et al. 2020). Previous studies (Sun et al. 2020; Mamos et al. 2021; Macher et al. 2023) have highlighted the need for additional mitogenome sequences of gammarids to resolve their phylogenetic relationships in more detail. The complete mitogenome of G. nipponensis contributes to such investigations. The information from this study may also lead to improved tools for phylogenetic analysis among populations of G. nipponensis.

Figure 2. Mitochondrial genome map of Gammarus nipponensis showing the arrangement of PCGs, tRNAs, rRNAs and putative control regions.